Similar resources
Training Data Cleaning for Text Classification
In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing the effectiveness of the resulting classifiers while minimizing the required amount of training effort. Training data cleaning (TDC) consists in devising ranking functions that sort the original training examples in terms of how...
An Efficient Algorithm for Data Cleaning of Log File using File Extensions
World Wide Web is a monolithic repository of web pages that provides the Internet users with heaps of information. With the growth in number and complexity of Websites, the size of web has become massively large. Web Usage Mining is a division of web mining that involves application of mining techniques to web server logs in order to extract the behavior of users. A Web Usage Mining process com...
Suspend-aware Segment Cleaning in Log-structured File System
The suspend feature of the modern smart device practically suppresses the background segment cleaning of the log-structured file system. In this work, we develop Suspend-aware Segment Cleaning for the log-structured file system. We seamlessly integrate the segment cleaning into the suspend module of the smartphone OS so that the log-structured file system can reclaim the free segments without i...
Research Statement Data Cleaning Algorithmic Data-cleaning Techniques
With the increasing amount of available data, turning raw data into actionable information is a requirement in every field. However, one bottleneck that impedes the process is data cleaning. Data analysts usually spend over half of their time cleaning data that is dirty — inconsistent, inaccurate, missing, and so on — before they even begin to do any real analysis. It is a time consuming and co...
Heuristic Cleaning Algorithms in Log-Structured File Systems
Research results show that while Log-Structured File Systems (LFS) offer the potential for dramatically improved file system performance, the cleaner can seriously degrade performance, by as much as 40% in transaction processing workloads [9]. Our goal is to examine trace data from live file systems and use those to derive simple heuristics that will permit the cleaner to run without interfering...
Journal
Journal title: IOSR Journal of Computer Engineering
Year: 2013
ISSN: 2278-8727, 2278-0661
DOI: 10.9790/0661-0921721